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Abstract 

Standard  Galerkin  finite  element  methods  for  variably  saturated  groundwater  flow 
have  several  deficiencies.  For  instance,  local  oscillations  can  appear  around  sharp 
infiltration  fronts  without  the  use  of  mass-lumping,  and  velocity  fields  obtained  from 
differentiation  of  pressure  fields  are  discontinuous  at  element  boundaries.  Here,  we 
consider  conforming  finite  element  discretizations  based  on  a  multiscale  formula¬ 
tion  along  with  recently  developed,  local  postprocessing  schemes.  The  resulting  ap¬ 
proach  maintains  the  basic  flexibility  and  appeal  of  traditional  finite  element  meth¬ 
ods,  while  controlling  nonphysical  oscillations  and  producing  element-wise  mass- 
conservative  velocity  fields.  Accuracy  and  efficiency  of  the  proposed  schemes  are 
evaluated  through  a  series  of  steady-state  and  transient  variably  saturated  ground- 
water  flow  problems  in  homogeneous  as  well  as  heterogeneous  domains. 

Key  words:  Richards’  equation,  finite  element  method,  multiscale  stabilization, 
local  conservation 


1  Introduction 


Richards’  equation  is  a  widely  studied  nonlinear  parabolic  equation  describ¬ 
ing  water  flow  in  variably  saturated  porous  media  [1].  Analytical  solutions  for 
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Richards’  equation  are  limited  in  number  and  applicability  dne  to  its  non¬ 
linearity  and  the  degree  of  spatial  heterogeneity  found  in  many  problems  of 
interest  [2],  Numerical  solutions  of  Richards’  equation  are  therefore  important 
for  modeling  groundwater  flow  and  contaminant  transport  in  the  subsurface. 

The  qualities  that  make  it  difficult,  if  not  impossible,  to  solve  analytically 
also  contribute  to  the  challenge  of  developing  accurate  and  efficient  numerical 
methods  for  Richards’  equation.  In  order  to  be  robust,  spatial  and  tempo¬ 
ral  approximations  must  be  capable  of  resolving  sharp  fronts  infiltrating  dry 
media  and  handling  transitions  to  elliptic  or  nearly  elliptic  conditions  in  sat¬ 
urated  regions  [3].  Regardless  of  the  discretization  used,  the  result  is  almost 
always  a  large  nonlinear  system  that  is  hard  to  solve  efficiently  [4,  5]. 

While  finite  difference,  finite  volume,  and  finite  element  approximations  are 
all  regularly  used  for  Richards’  equation  [6-8],  the  focus  here  is  on  finite  ele¬ 
ment  methods  (FEMs),  which  are  appealing  because  they  are  well-suited  for 
unstructured  spatial  meshes  and  so  can  be  readily  applied  to  irregular  do¬ 
mains  and  facilitate  some  classes  of  adaptive  refinement  [9].  Classical  FEMs 
do  not  perform  well  for  Richards’  equation.  Most  notably,  lumping  and  up- 
winding  schemes  are  required  to  prevent  significant  non-physical  oscillations 
[6,  10]  and  straightforward  evaluation  of  Darcy’s  law  leads  to  an  approximate 
groundwater  velocity  field  that  is  not  locally  conservative  over  mesh  elements. 
By  this  we  mean  that  normal  fluxes  are  discontinuous  across  element  bound¬ 
aries,  and  the  velocity  field’s  divergence  over  an  element  does  not  balance  the 
discrete  mass  accumulation  [11], 

Many  remedies  to  these  deficiencies  have  been  proposed.  Mass  lumping  and 
upwinding  are  commonly  introduced  to  smear  steep  transition  zones,  while 
postprocessing  schemes  or  mixed  FEM  formulations  have  been  considered  to 
obtain  locally  conservative  velocity  fields  [6,  10,  12-14],  An  exhaustive  review 
of  available  discretization  techniques  is  beyond  the  scope  of  this  work.  A  review 
of  recent  methods  can  be  found  in  [15] 

The  purpose  of  this  work  is  to  evaluate  two  complementary  strategies  for  im¬ 
proving  conforming  Galerkin  (CG)  FEM  approximations  of  Richards’  equa¬ 
tion  on  simplicial  meshes  in  one,  two,  or  three  dimensions.  Specifically,  we 
consider  a  multiscale,  stabilized  finite  element  formulation  [16]  and  a  pair  of 
local  postprocessing  techniques  from  [17]  and  [18]  that  produce  element-wise 
conservative  velocity  fields.  The  approaches  are  also  orthogonal  in  the  sense 
that  the  velocity  postprocessing  methods  are  applicable  for  generic  CG  ap¬ 
proximations  [17]  or  fairly  general  data  sets  [18],  and  the  stabilized  solution 
is  not  restricted  to  the  proposed  postprocessing  [19].  On  the  other  hand,  the 
combined  approach  is  appealing,  since  it  maintains  the  basic  flexibility  and 
simplicity  of  traditional  FEMs  while  controlling  nonphysical  oscillations  and 
producing  element-wise  conservative  velocity  fields. 
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This  paper  is  organized  as  follows.  In  the  next  section  we  review  relevant 
work  on  FEMs  and  locally  conservative  velocity  approximations.  In  §3  we 
summarize  our  formulation  of  Richards’  equation,  while  multiscale  stabilized 
CG  methods  based  on  the  work  of  [20]  are  described  in  §4.1  -  §4.3,  and  al¬ 
gorithms  for  obtaining  locally  conservative  approximations  based  on  [17,  18] 
are  presented  in  §4.4.  Finally,  in  §5  we  provide  numerical  results  comparing 
these  methods  using  several  error  and  computational  work  measures,  followed 
by  discussion  and  conclusions. 


2  Background 


2. 1  Finite  element  approximations  for  Richards  ’  equation 


A  variety  of  FEM-based  schemes  have  been  applied  to  Richards’  equation.  For 
instance,  [6]  presented  a  CG  method  for  Richards’  equation  based  on  modified 
Picard  linearization  for  temporal  derivatives  and  mass-lumping,  which  corre¬ 
sponds  to  a  reduced  order  quadrature  formula  for  the  mass  accumulation  term. 
The  resulting  approximation  is  globally  conservative  with  good  control  of  spu¬ 
rious  numerical  oscillations.  Along  with  the  similar  “chord-slope”  technique 
from  [21],  this  can  be  considered  the  standard  approach  for  treating  accumula¬ 
tion  terms  in  Richards’  equation.  Later,  [10]  demonstrated  that  mass-lumping 
combined  with  appropriate  “upwinded”  relative  permeability  evaluation  can 
lead  to  monotone  approximations  independent  of  mesh  resolution. 

Standard  CG  schemes  do  not  directly  yield  element-wise  conservative  veloc¬ 
ity  fields  as  defined  above,  and  an  additional  postprocessing  step  is  needed 
[22,  23,  47].  This  complication  is  avoided  by  finite  element  formulations  that 
include  an  explicit  velocity  representation,  such  as  mixed  FEMs  and  local  dis¬ 
continuous  Galerkin  (LDG)  methods  [24,  25].  While  they  have  proven  success¬ 
ful  for  many  subsurface  flow  problems  including  Richards’  equation  [12,  14,  26], 
mixed  FEMs  are  not  without  drawbacks.  The  resulting  linear  systems  are  sad¬ 
dle  point  problems  without  hybridization  [24],  and  the  number  of  unknowns 
is  greater  than  a  nodal  CG  approximation  of  the  same  order  on  the  same 
mesh  [25].  As  with  CG  methods,  overshoot  and  undershoot  can  also  occur 
for  sharp  front  problems  without  upwinding  and  reduced  order  quadrature 
approximations  [4],  Similarly,  LDG  approximations  can  exhibit  overshoot  and 
undershoot  when  direct  evaluation  of  relative  hydraulic  conductivities  is  used 
without  upwinding  [3].  It  should  be  noted  that  this  behavior  is  a  function 
of  grid  resolution.  For  sufficiently  refined  meshes,  oscillations  are  eliminated. 
Sufficient  resolution  is,  however,  unreachable  in  many  cases  without  the  use 
of  adaptive  mesh  refinement  [12,  27]. 
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A  thorough  evaluation  of  the  relative  merits  of  mixed  and  CG  FEM  discretiza¬ 
tions  for  Richards’  equation  is  beyond  the  scope  of  this  work.  Direct  compar¬ 
isons  between  a  mixed  hybrid  FEM  and  a  traditional,  local  velocity  postpro¬ 
cessing  technique  [28]  for  saturated  groundwater  flow  can  be  found  in  [29-31]. 
Broadly  speaking,  the  added  degrees  of  freedom  typically  associated  with  a 
mixed  method  must  be  weighed  against  the  accuracy  and  generality  of  a  given 
postprocessing  technique  for  CG  FEMs.  Moreover,  these  approaches  are  not 
necessarily  mutually  exclusive  (see  for  instance  the  combination  of  CG  and 
LDG  approximations  in  [25]).  In  our  view,  a  fully  satisfactory  approach  has 
not  been  reached,  since  existing  mixed  and  CG  discretizations  require  some 
type  of  lumping  and  reduced  order  approximation  to  avoid  oscillation. 


2.2  Variational  multiscale  finite  element  methods 


Upwinding  and  mass  lumping  have  been  recognized  for  some  time  as  an  at¬ 
tempt  to  properly  account  for  “subgrid”  effects  in  discretizations  [32],  From 
this  perspective,  it  is  unresolved  features  in  the  solution  that  produce  quali¬ 
tatively  incorrect  features  such  as  spurious  local  maxima  and  minima.  Stabi¬ 
lized  methods  like  the  streamline  upwind  Petrov  Galerkin  (SUPG)  discretiza¬ 
tion  were  originally  introduced  as  artificial  viscosity  methods  that  attempted 
to  introduce  minimal  amounts  of  numerical  viscosity  to  control  oscillations 
on  coarse  meshes  [33].  In  the  last  twenty  years,  stabilized  FEMs  have  been 
widely  used  within  the  computational  fluid  dynamics  community  for  modeling 
advection-diffusion-reaction  systems  as  well  as  incompressible  Navier-Stokes 
problems  [34-36]. 

Residual-based,  stabilized  finite  elements  were  recast  as  the  multiscale  vari¬ 
ational  method  in  [20].  This  provided  a  framework  for  relating  unresolved 
solution  components  and  stabilization  terms  used  to  supplement  standard 
Galerkin  formulations.  Variational  multiscale-based  approximations  have  since 
become  increasingly  popular  [16,  37,  38]  and  were  applied  to  two-phase  flow 
in  porous  media  in  [39]  as  well  as  [40]. 

The  focus  of  traditional  stabilized  methods  was  improved  numerical  approx¬ 
imation,  particularly  the  resolution  of  internal  or  boundary  layers  [33,  41]. 
This  is  also  the  perspective  taken  in  [40].  The  emphasis  of  [42]  as  well  as  the 
related  multiscale  finite  element  and  finite  volume  methods  from  [43-46]  is  on 
incorporation  of  fine-scale  heterogeneity  into  the  resolved  (or  coarse-scale)  so¬ 
lution.  In  the  development  below,  we  will  focus  on  improved  resolution  of  sharp 
fronts,  since  this  is  a  particular  challenge  in  extending  numerical  methods  for 
saturated  flow  to  Richards’  equation.  To  our  knowledge  the  variational  multi- 
scale  framework  has  not  been  applied  to  Richards’  equation,  nor  has  the  issue 
of  obtaining  locally  conservative  groundwater  velocity  holds  from  stabilized, 
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conforming  FEMs  been  addressed. 


2.3  Velocity  postprocessing  techniques 


As  mentioned  above,  standard  CG  methods  for  groundwater  flow  do  not  di¬ 
rectly  yield  locally  conservative  velocity  approximations.  A  global  postpro¬ 
cessing  technique  can  be  found  in  [13]  while  a  local  postprocessing  algorithm 
based  on  dual  meshes  and  stream  functions  can  be  found  in  [28] .  On  the  other 
hand,  the  recent  popularity  of  discontinuous  Galerkin  methods  has  renewed 
interest  in  local  conservation  for  CG  discretizations.  The  general  conservation 
properties  of  CG  methods  for  advection-diffusion-reaction  equations  and  in¬ 
compressible  Navier-Stokes  were  investigated  in  [19,  22],  Furthermore  a  gen¬ 
eral  method  for  obtaining  locally  conservative  velocities  from  standard  CG 
solutions  was  outlined,  which  required  the  solution  of  a  global  linear  system. 

Here,  we  focus  on  two  recent  postprocessing  schemes  from  [IT]  and  [18]  that 
can  be  computed  locally  and  are  quite  general  in  their  domain  of  applica¬ 
bility.  The  two  methods  are  related,  since  they  arise  from  equivalent  global 
formulations  based  on  piecewise  constant  corrections  to  a  velocity  held  that 
is  singly  defined  on  element  boundaries,  but  not  locally  conservative  element¬ 
wise  [17,  18].  The  approaches  differ  in  that  [17]  focuses  explicitly  on  velocities 
obtained  from  CG  solutions,  while  [18]  addresses  more  general  velocity  fields 
(e.g.,  non-conservative  velocity  fields  arising  from  held  measurements  as  well 
as  numerical  methods). 

The  approach  taken  in  [17]  is  based  on  the  idea  of  enriching  the  solution 
space  with  additional  discontinuous  degrees  of  freedom  after  a  CG  solution 
has  been  obtained.  The  CG  solution  is  modified  strictly  to  satisfy  the  local 
conservation  property,  which  results  in  a  postprocessing  method  that  depends 
only  on  the  mesh  and  information  left  over  from  the  CG  solve.  A  local  approach 
[17,  algorithm  2]  is  derived  by  using  piecewise  linear  enrichment  and  domain 
decomposition.  The  result  is  a  locally  conservative  velocity  held  defined  on 
element  boundaries  (i.e. ,  a  numerical  trace)  after  directly  solving  small  linear 
systems  over  the  mesh  node  stars,  which  are  simply  the  collection  of  elements 
sharing  a  given  node  for  each  node  in  the  mesh. 

The  perspective  taken  in  [18]  is  to  minimize  the  correction  necessary  to  obtain 
local  conservation  in  a  suitably  chosen  norm.  A  Gauss-Seidel  like  iteration 
is  then  derived  which  requires  strictly  local  updates  using  coefficients  that 
depend  only  on  mesh  information.  The  overhead  for  an  iteration  of  the  Sun- 
Wheeler  Gauss-Seidel  approach  is  less  than  that  of  [17],  since  it  uses  element 
interface  connectivity  and  does  not  require  solution  of  local  systems.  On  the 
other  hand,  the  algorithm  is  iterative  with  a  convergence  rate  that  depends 
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on  the  mesh  [18]. 


In  both  approaches,  the  output  is  a  locally  conservative  velocity  held  defined 
on  element  boundaries.  This  may  be  sufficient  for  many  applications,  like 
use  of  the  velocity  for  a  finite  volume  transport  calculation.  Alternatively,  the 
element  boundary  data  can  be  used  to  define  a  local  Neumann  problem  on  each 
element  that  can  then  be  solved  with  a  mixed  finite  element  space  to  obtain 
a  conservative,  globally  defined  velocity  held.  In  some  cases,  like  the  lowest 
order  Raviart-Thomas  space,  the  local  step  reduces  to  a  simple  projection[18]. 
With  a  global  mixed  FEM  velocity  representation,  not  only  local  conservation 
but  also  higher  order  compatibility  with  transport  discretizations,  in  the  sense 
of  [47],  is  then  available. 


2-4  Nonconforming  finite  element  approximation 


To  better  evaluate  accuracy  and  computational  expense  for  the  postprocessing 
techniques  from  §4.4.1  and  §4.4.2,  we  would  like  to  compare  them  to  veloci¬ 
ties  obtained  from  mixed  finite  element  approximations,  since  these  are  well- 
established,  locally  conservative  methods  [24,  48].  For  simplicity,  we  follow  an 
approach  similar  to  that  taken  in  [49]  and  exploit  a  correspondence  between 
P 1  nonconforming  and  mixed  hybrid  FEM  approximations  [50]. 

To  be  more  specihc,  a  very  simple  postprocessing  step  was  shown  in  [51]  to 
give  correspondence  between  the  standard  P 1  nonconforming  approximation 
and  a  mixed  hybrid  FEM  solution  for  elliptic  problems  with  isotropic  conduc¬ 
tivity  and  a  piecewise  constant  data  assumption.  Arbogast  et.  al  [52]  showed 
correspondence  between  mixed  and  nonconforming  methods  for  a  larger  class 
of  elliptic  problems  including  lower  order  terms.  In  general,  the  correspon¬ 
dence  requires  supplementing  standard  nonconforming  spaces  with  bubbles 
and  a  series  of  L2  projections  in  the  weak  formulation.  On  the  other  hand, 
[53]  presented  a  postprocessing  approach  in  which  the  RTO  velocity  and  po¬ 
tential  could  be  obtained  via  postprocessing  of  P 1  nonconforming  solutions 
for  the  potential  variable  without  reference  to  bubble  functions  in  the  case  of 
full  tensor  conductivities  as  well  as  quasi-linear  elliptic  equations.  This  is  the 
approach  followed  below  for  comparison  with  the  CG  velocity  postprocessing 
techniques. 


3  Richards’  equation 


We  begin  with  a  mass  conservation  statement  for  the  aqueous  phase  in  an  air- 
water  system  where  the  solid  phase  is  assumed  to  be  immobile  and  interphase 
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mass  transfer  is  negligible 


mt  +  V  •  (pq)  =  b,  for  x,  t  G  fl  x  [0,  T] 

m  =  p9  (1) 

along  with  the  standard  extension  of  Darcy’s  law  to  variably  saturated  condi¬ 
tions  [54] 


q  =  -krKs  (V-0  -  pgu)  (2) 

ffere  0  and  6  are  the  pressure  head  and  volume  fraction,  respectively,  and 
P  —  q/Qo  is  a  normalized  density.  The  relative  permeability,  kr  is  assumed  to 
be  a  scalar,  but  the  saturated  conductivity  Ks  is  not.  b  is  a  generic  source  term, 
and  g(i  is  a  unit  vector  accounting  for  the  direction  of  gravity  [11].  Finally,  fl 
represents  the  spatial  domain  and  [0,T]  the  time  interval  of  interest. 

An  equation  of  state  for  g  and  constitutive  ( p-s-k )  relations  are  necessary  to 
close  eqns  (1)  and  (2).  In  the  following,  we  assume  that  the  aqueous  phase 
density  can  be  written 


g=g0e^-^) 


(3) 


for  a  reference  pressure  head,  0 o-  We  also  adopt  standard  van  Genuchten  [55] 
and  Mualem  [56]  relations  to  describe  the  interdependence  of  fluid  pressure, 
saturation,  and  the  relative  permeability,  although  the  approaches  considered 
are  not  restricted  by  these  choices.  For  -0  <  0,  the  van  Genuchten-Mualem 
p-s-k' s  are 


(9  -  9r) 


( 9S  -  9r) 

[i  +  (aj0ir-rm- 


i-  a 


l/mvg 


(4) 


where  se  is  the  effective  saturation,  9r  is  the  residual  volumetric  water  content, 
9S  is  the  saturated  volumetric  water  content,  avg  is  a  parameter  related  to  the 
mean  pore  size,  nvg  is  a  parameter  related  to  the  uniformity  of  the  soil  pore- 
size  distribution,  and  rnvg  =  1  —  1  jnv.  For  0  >  0,  the  porous  medium  is  fully 
saturated,  and  eqn  (4)  reverts  to  se  —  1  and  kr  =  1. 


Initial  and  boundary  conditions  for  eqns  (1)  and  (2)  are  written 


0(x,O)=0° 


(5) 
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^(x,  t)  =  'ipb,  xeTo  (6) 

pq-n  =  a\  x  G  T^  (7) 

where  T  =  <9h2  =  rDUTjv,  ^01^  =  0,  and  n  is  the  unit  outer  normal  on 

on. 

Several  variations  of  eqns  (1)  and  (2)  are  common  depending  on  the  choice  of 
solution  variable  (ip  or  6)  and  the  manner  in  which  the  accumulation  term  in 
eqn  (1)  is  handled.  Here,  we  follow  the  basic  approach  outlined  in  [7],  which 
can  be  seen  as  a  generalization  of  the  standard  mixed  form  [6].  That  is,  we 
take  ip  as  the  dependent  variable,  apply  temporal  approximations  directly  to 
rrit  rather  than  using  the  chain  rule,  and  rely  upon  the  nonlinear  solution 
approach  to  resolve  the  dependence  of  m  on  ip  [7,  11]. 


4  Numerical  methods 


Starting  from  the  basic  mass-conservative  formulation  in  §3,  we  describe  a 
multiscale-stabilized  finite  element  method  for  Richards’  equation  below.  We 
then  present  straightforward  extensions  of  postprocessing  algorithms  from  [17] 
and  [18]  to  obtain  locally  conservative  velocity  fields  from  CG  solutions  of 
nonlinear  scalar  parabolic  PDEs.  Lastly,  we  summarize  our  approaches  for 
time  integration  and  solving  the  linear  and  nonlinear  systems  that  arise  from 
the  multiscale  finite  element  discretization. 


4- 1  Weak  formulation 


In  an  attempt  to  simplify  presentation  of  the  numerical  methods  below,  we 
Erst  substitute  eqn  (2)  into  eqn  (1)  and  write  the  result  as  a  generic  nonlinear 
advection-diffusion  equation 


mt  +  V  •  (f  —  aVip)  =  b 


(8) 


where 


f  p  krKsgu, 
a  =  pkrKs 

<t  =  f  —  flV  ip  —  pq  (9) 


We  then  apply  Hughes’  variational  multiscale  paradigm  [20]  to  eqn  (8)  in  a 
manner  similar  to  the  approach  taken  in  [40].  To  begin,  trial  solutions  are 


sought  in 


V  =  {v  G  H\n)  :v  =  pb  on  TD}  (10) 

while  test  functions  are  taken  from 

W  =  {w  G  if1  (ft)  :  w  =  0  on  TD}  (11) 

where  H1( ft)  is  the  Sobolev  space  of  functions  that  are  square  integrable  and 
have  square-integrable  first  derivatives  over  ft.  For  t  G  [0,  T]  hxed,  a  solution 
ijj  E  V  is  sought  such  that 


J  [mt  +  V  •  (f  —  aVijj)  —  b\  w  dx  =  0,  (12) 

n 

J  -0(x,  0)tcda;  =  J  ^°(x)tc  dx,  Vw  G  W  (13) 

Cl  Cl 

Integrating  eqn  (14)  by  parts  gives  the  weak  statement,  find  ijj  G  V  such  that 


J  mtw  d x  ~  J  (f  —  aVip)  ■  Vic  dx  +  J  crbw  ds  —  J  bw  dx 


rN 


ci 


=  0,  Vic  G  W 


(14) 


4-2  Temporal  approximation 


Before  applying  a  finite  element  approximation  we  discretize  in  time,  following 
[57],  Specifically,  we  use  fully  implicit  backwards  difference  formula  (BDF) 
methods  [58]  in  which  all  terms  in  eqn  (14)  are  approximated  at  the  new  time 
level  tn+1  and  the  accumulation  term  is  approximated  as 

rnt  «  frit  =  an+1m('0n+1)  +  /3n  (15) 


Here  an+1  depends  on  the  order  of  approximation  and  time  step  history,  while 
P  depends  on  the  order  of  approximation,  time  step  history,  and  solution 
history.  For  instance,  with  a  first  order  (backward  Euler)  approximation,  a  = 
1/A tn+1,  pn  =  -m(f)/Afn+1  for  A tn+1  =  tn+1  -  tn. 
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4-3  Multiscale  finite  element  approximation 


In  order  to  present  the  multiscale  formulation,  we  insert  some  notation.  Let 
Th  be  a  simplicial  triangulation  of  hi  in  Mnd,  n(j  =  2,  3,  containing  Ne  elements, 
{fle},  e  —  1,. . . ,  Ne,  Nf  element  boundaries,  or  faces,  {7/},  f  =  1, ,  Nf,  and 
Nn  nodes,  {xn},  n  —  1, . . . ,  Nn .  The  collection  of  faces  in  the  domain  interior 
is  denoted  Tj.  We  also  assume  that  the  intersection  of  elements  Oe,  0e/  G  7),, 
is  either  empty,  a  unique  7/  G  T/,  an  edge  (for  M3),  or  a  point.  The  diameter 
of  Vte  is  he  and  its  unit  outer  normal  is  written  ne. 

The  multiscale  view  of  stabilization  involves  splitting  V  and  W  into  resolved 
and  unresolved  scales 

V  =  Vh®6V  (16) 

W  =  Wh®SW  (17) 

In  our  case  14  and  IT/,,  are  just  the  usual  conforming  piecewise  linear  Galerkin 
spaces 


vh  =  {vh  evn  C°(Q)  :  vh\ne  G  P1^)}  (18) 

Wh  =  {wh  G  IT  n  C°(Q)  :  wh\Qe  G  P\ne)}  (19) 

while  SV  and  5W  remain  inhnite  dimensional.  The  solution  is  then  written 
uniquely  as  fi’h  +  and  the  subgrid  error,  Sift,  is  approximated  using  local 
problems  over  each  element.  Note  that  we  also  assume  that  the  boundary  and 
initial  data,  eqns  (5)-(7),  can  be  accurately  approximated  at  the  grid-scale. 

Inserting  eqns  (15)  and  (19)  into  eqn  (14)  and  taking  advantage  of  its  linearity 
in  w  allows  us  to  write  a  coupled  problem  for  the  solution  'fih  +  at  tn+1 


Fh  —  j  mtWh  d x  —  J  (f  —  aSJfi)  ■  Wwh  dx  +  J  <7bWh  ds  —  J  bwh  dx 

n  t  j\[  ^2 

=  0,  Vw/j  G  Wh  (20) 

Fs  =  Jmt6wdx  —  J  (f  —  aSJ'fi)  ■  W5w  da:  +  J  ab5w  ds  —  J  bdw  dx 


n  ci 

=  0,  W5w  G  SW 


(21) 


where  we  have  dropped  time  level  indicators  for  convenience,  with  the  under¬ 
standing  that  all  terms  are  evaluated  at  tn+l .  Before  proceeding,  we  point  out 
that  our  goal  is  to  obtain  a  modified  version  of  eqn  (20) 
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(22) 


Gh  =  Fh-J2  I  C*shwhT  IZhfyh)  dx  =  0,  \/wh  e  Wh 

e  He 


and  a  corresponding  linearized  system  of  equations  to  use  in  a  Newton  solution 
algorithm  for  Here,  7 Zh  is  an  approximation  to  the  strong  residual 


1Z  —  rht  +  V  •  (f  —  aV'ip)  —  b  (23) 

and  C*sh  approximates  the  formal  adjoint  of  a  linear  operator  Cs  defined  below. 

In  order  to  linearize  eqns  (20)  and  (21),  we  label  the  Newton  increment  v  = 
i>h  +  Sv  and  seek  an  approximate  solution  ip+  =  +  v.  We  linearize  around 

the  coarse-scale  quantity  at  each  iteration  and  assume  Sv~  =  0  [40].  The 
coupled,  linearized  problem  is 


J  m!tvwh  dx  —  j  (f'v  —  a'Vip^v  —  aVvj  ■  Vwh  dx  = 
n  o 

Vw/j  e  Wh 

J  rh'tvSw  dx  —  J  (f'v  —  a'V'tp^v  —  aVrj  •  WSw  dx  = 


-F; 


\/Sw  e  SW 


(24) 


(25) 


Here,  the  '  symbol  represents  differentiation  with  respect  to  ip  and  nonlinear¬ 
ities  are  evaluated  at  ipfp.  Below,  a  superscript  ~  is  used  to  denote  evaluation 
at  ipfp  where  necessary  to  avoid  confusion. 


4-3.1  Subgrid-scale  approximation 

We  make  a  number  of  modeling  assumptions  to  obtain  a  computationally 
tractable  subgrid-scale  approximation.  First,  we  take  a  domain  decomposition 
approach  to  obtain  local  (element)  problems  for  Sv.  This  is  accomplished  by 
assuming  5w  —  Sv  —  0  on  dSle  for  all  e.  On  each  Oe,  eqn  (25)  can  then  be 
written 


J  m'fSvSw  dx  —  J  (f 'Sv  —  a'Wiph  Sv  —  aVSv )  •  VSw  dx  = 

rig  rig 

—  J  m'tvh5wdx  +  j  (fvh-a'Vi,-hvh-aWh)-V6wiX-Fle  (26) 


where 
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(27) 


Fsfi  =  I  rhtSw  dx  —  J  (f  —  aV^)  •  VSw  dx  —  J  bSw  dx 


Integrating  the  second  term  on  the  right  hand  side  by  parts  gives 


J  rh't5vSw  dx  —  J  (i'Sv  —  a'\/^h  5v  —  aV6vj  ■  W5w  dx  - 


—  /  m'tVhSwdx  —  /  V  •  f'vh  —  a/'VriJjhVh  —  aVv 


8w  dx  —  F> 


S,e 


(28) 


which  motivates  definition  of  the  linearized  operator 


Cv  =  rrifV  +  V  •  i'v  —  a'W^h  v  —  aWv 
=  rrifV  +  Csv 


(29) 


After  again  integrating  Fgte  and  the  left  hand  side  by  parts,  eqn  (28)  can  be 
written 


J  CSvSwdx—  —  J  Cvh8w dx  —  J  TZ{^h)8wdx  (30) 

where  Ci\  can  be  understood  as  a  grid-scale  linearization  of  TZ(ip^)  on  each 
Qe.  To  simplify  evaluation  of  £  and  7 Z,  we  consider  using  the  nonconservative 
approximations 


TZh  =  rht  +  f'  •  V'ljjh  -  a'Vtph  ■  Wi>h  -  b  (31) 

Chvh  =  m'tvh  +  f'  •  Vuh  -  a'V'iph  ■  Vvh  (32) 

where  coefficients  are  evaluated  pointwise  using  on  each  f2e  and  second  or¬ 
der  terms  are  dropped,  since  P 1  trial  functions  are  used.  Conservative  approx¬ 
imations  could  also  be  formulated,  but  as  the  approximations  above  are  simple 
and  do  not  affect  the  conservation  properties  of  the  grid-scale  discretization, 
we  use  them  for  this  work. 


4-3.2  Grid-scale  equation 

Returning  to  eqn  (24),  we  collect  terms  for  Vh  and  8v  and  write 


J  m!tvhwh  dx-  J  (f'vh  -  a'V4’h  vh  -  aVvh^j  ■  Vwh  dx 
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+  y  j  m't5vwh  d a;  —  y  y  (i'Sv  —  a'Viph  8v  —  aVSv^j  ■  Vwh  da; 
e  ne  e  (ie 

=  ~Fh 

Although  not  strictly  necessary,  we  neglect  the  effects  of  subgrid  scale  variation 
on  the  grid-scale  accumulation  term  (a  “static  subgrid  scales”  assumption  [38]) 
and  integrate  the  fourth  term  on  the  left  hand  side  by  parts  to  obtain 


/  dx- 1 (f “  a'^hVh  ~  aVVh) ' VWh  dI 
n  n 

+  y  f  £*swh8v  dx  =  - F h 


cie 


(33) 


where  boundary  terms 


y,  /  aVvjfr  •  n eSv  ds  (34) 

e  dne 

in  the  integration  by  parts  have  again  been  neglected  in  the  introduction  of 
the  formal  adjoint  of  Cs 

C*sw  =  —  (V  —  a'V^J  ■  Vic  —  V  •  (aVw)  (35) 

As  with  Chi  we  drop  second  order  terms  in  eqn  (35)  and  approximate  C*  as 


C*S)hwh  —  •  Vrc  —  a'Vipf >  •  Vw 


(36) 


To  obtain  a  system  in  terms  of  the  grid-scale  quantities  alone,  we  use  a  sim¬ 
ple  algebraic  subgrid  scale  (ASGS)  approximation  [16,  33]  and  again  neglect 
temporal  variation  in  the  subgrid  scale  Newton  correction  [38].  We  then  insert 


8v  «  -T  (chvh  +  nh ) 

into  eqn  (33)  to  obtain  a  linearized,  grid-scale  equation 


(37) 


j  rri'tvhwh  dx  -  J  (('vh  -  a'Viph  vk  -  aVvh)  ■  Vwh  dx 
ci  ci 

~  y  [  C.*s  hwhrChvh  dx  =  — y  +  y  /  C*s  hwhT-R^e  dx 


(38) 
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The  right  hand  side  of  eqn  (38)  is  just  —  Gh  from  eqn  (22).  The  left  hand  side 
contains  the  standard  conforming  Galerkin  terms  for  the  Newton  Jacobian  plus 
an  additional  term  arising  from  the  stabilization.  This  yields  a  quasi-Newton 
method,  since  the  linear  operator  on  the  left  hand  side  is  not  precisely  the 
Jacobian  of  Gh  due  to  the  linearization  of  the  stabilization  terms. 


4-3.3  Stabilization  parameter 


There  are  a  number  of  ways  to  define  the  stabilization  parameter  r.  Typically, 
these  are  motivated  by  analysis  of  linear  advection-diffusion-reaction  equa¬ 
tions  [33,  59].  For  the  ID  linear  advection-diffusion  equation  with  constant 
coefficients,  the  solution  of  the  subgrid  error  approximation  yields 


r  = 


he 

2a' 


LthDG 

2a'  ' 

L  V  y 

hef'\ 

(39) 


which  was  already  widely  used  in  earlier  stabilized  methods  [20].  Here,  we  use 
a  straightforward  multi-dimensional  approximation  of  r  on  De,  similar  to  that 
used  in  [33,  40,  59], 


r  = 


'jf'-q'wyA2 1  a/jNicoT 


hP 


hi 


where  ||  ||2  is  the  vector  2-norm  and  ||  lloo  is  the  matrix  oo-norm. 


(40) 


4-3-4  Shock  capturing  diffusion 

The  ASGS  approximation  above  does  not  in  general  ensure  that  no  spurious 
local  extrema  will  be  generated.  Particularly  for  nonlinear  problems,  steep 
gradients  in  the  solution  can  form  which  produce  small  undershoot  or  over¬ 
shoot  near  the  front.  In  such  cases,  it  is  common  to  include  a  shock  capturing 
numerical  diffusion  term  so  that  the  grid-scale  residual  equation  becomes 


Gh  =  Fh  — 


I  £*s,hWhTTlh('fh)  dx  +  J2  [  ■  Vw/i  dx 

e  He  6  fie 


—  0,  Vwh  G  Wh 


(41) 


There  are  several  different  approaches  for  defining  the  associated  numerical 
diffusion  parameter  [60],  but  here  we  simply  use  the  standard  isotropic  defini¬ 
tion  [60,  61]  for  each  Vte 
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(42) 


he  l^/ijel 

V  =  Vc~2 ~WM2 

where  vc  is  a  problem  dependent  parameter. 


4-4  Velocity  postprocessing  algorithms 


The  goal  of  our  postprocessing  is  to  obtain  a  velocity  held  <77  that  conserves 
mass  discretely  on  each  Qe,  e  =  1, ...  ,Ne,  has  continuous  normal  component 
across  each  7 f  G  T/,  and  satishes  the  boundary  condition  eqn  (7)  along  Tjv. 
To  do  this,  we  extend  the  techniques  of  Larson  and  Niklasson  [17]  and  Sun 
and  Wheeler  [18]  to  eqns  (8)  and  (9)  in  a  straightforward  manner. 


4-4-1  Larson-Niklasson  [17]  postprocessing 

Before  proceeding,  some  additional  notation  is  required.  The  set  of  elements 
sharing  node  xn,  or  its  star,  is  written  £(n).  The  number  of  elements  sharing 
xn  is  N*e  =  card(£(n)).  For  each  interior  face  7 /  =  dflg  fl  <9h2r,  we  write 
and  nr  for  the  unit  outer  normal  to  f \g  and  h2r,  respectively.  The  global 
element  identihers  for  Vtg  and  fir  are  also  written  eg(f)  and  er(f).  An  arbitrary 
choice  of  nj  =  rif  is  made  in  order  to  associate  a  unique  unit  normal  vector 
with  each  7 p.  The  set  of  faces  contained  in  dfle  that  are  in  r  /-  U  Y D  is  written 
JFj  d(e).  The  global  identihers  for  the  nodes  contained  in  a  face  7 f  are  denoted 
J\f(f).  A  local  numbering  of  elements  e  e*  for  f \  G  £(n)  is  also  associated 
with  each  node-star.  Figures  1  and  2  illustrate  our  notational  conventions  for 
a  two-dimensional  triangulation. 

Since  we  consider  only  piecewise- linear  approximations,  global  basis  functions 
are  naturally  associated  with  nodes  xn,  n  —  1, . . . ,  Nn.  For  each  node,  we  then 
write  the  corresponding  test  function  with  w/l(xn)  =  1  as  w^n  and  the  discrete 
residual  associated  with  each  e  G  £(n)  as 

Gh,n,e  =  j  rhtWh,n  dx  -  J  (f  -  dVlph)  ■  Vwh,n  dx  +  J  (TbWhtn  ds 

Qe  Qe  (9f2enrjv 

-  J  bwh)n  dx  -  J  C*s  hwKnTTZh(p[h)  dx 

+  /  oViph  ■  WwKn  dx  (43) 

Qe 

In  terms  of  the  local  numbering  on  £(n),  the  conservation  residual  associated 
with  Vte  is  similarly 
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£{n)  =  {8,9,17, 18,19} 
e*  —  {1,  5,  2,  3, 4} 


ee{f)  =  17,  er(f)  =  18 

eKf)  =  2>  K(f)  =  3 


Fig.  1.  Notation  example  for  interior  node,  xn 


n  i 

A 7(/)  =  {m,n2}  —  {22, 23} 


Fig.  2.  Notation  example  for  7^  e  T/  (left)  and  7^  €  (right) 


+  J2  [&hj  +  (t7n,e*(/)  -  UnKU))  n/]  wKn  ■  ne  ds 


where  is  the  average  velocity  accross  7 f  given  by 


(f  -  oVV>)  In  {/l  +  (f  -  aV«i)  |n,r(„ 
^ =  2 


(44) 


(45) 


and  where  e|(/)  and  e*(/)  refer  to  the  local  numbering  of  the  elements  neigh¬ 
boring  7 f,  and  Un,e*,  e*  =  1, . . .  ,N*e  are  the  piecewise  constant  corrections 
for  which  the  algorithm  solves. 

At  each  node,  we  then  have  a  system  of  N*  e  equations  from  (44) 


Rn(Un)  =  0 


(46) 
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where  R„  =  [Rn,  i, . . . ,  Rn,N*JT  and  Un  =  [Un^, ...,  U7hN*JT.  Solution  of  eqn 
(46)  corresponds  to  a  pure  Neumann  problem  on  £ (n)  when  xra  G  T/,  which  is 
unique  up  to  a  constant  [17].  To  handle  this  non-uniqueness,  we  simply  enforce 
Unj  =  0  for  nodes  in  Tj.  Note  that  this  arbitrary  condition  does  not  affect 
the  resulting  velocity,  which  is  unique  [17]. 

To  solve  eqn  (46),  we  first  compute  Rn(0)  which  is  just  the  conservation 
residual  for  <77.  Un  follows  from 

u„  =  -j;'R(0)  (47) 


Here 


T  _  dRn 

n’ij  da, 


i  =  2,.. 


n,j 


with  Jn  x  j  =  <5x  7  and  Rn  i  = 


• ,  K,e  and  j  =  1, . . . ,  K)£  (48) 

0  for  xn  G  T/  U  Tn  in  order  to  enforce  UnA  =  0. 


Eqn  (47)  is  just  a  single  step  of  Newton’s  method  for  eqn  (46),  which  suffices 
since  eqn  (46)  is  linear  in  Un  even  though  it  is  nonlinear  in  ^7.  Looking  at 
eqn  (44),  the  entries  of  J n  are  not  dependent  on  the  solution,  coefficients,  or 
boundary  conditions,  and  depend  only  on  the  computational  mesh  for  interior 
nodes.  For  nodes  on  <9h2,  J n  depends  on  boundary  condition  types  but  not 
actual  values.  The  local  systems  Jn,  n  =  1, ,  Nn  can  be  built  and  stored  in 
factored  form  and  only  need  updating  when  the  mesh  adapts  or  changes. 


Once  a  solution  for  Un.e* ,  e*  =  1, . . . ,  N*  for  each  n  =  1, . . . ,  Nn,  has  been 
obtained,  the  corrected  velocity  on  each  7/  €  T/  U  VD  is 


CTh,f  =  (ThJ+  Yl  (Un,e*t{f)-UnKU))nfwKn  (49) 

neAf(f) 

with  omitted  when  7 y  e  To.  We  set  &hj  =  crbu.f  for  7/  G  Tat. 

&hj  -  n/  is  continuous  across  7 /  by  construction.  To  see  that  it  is  locally 
conservative,  note  that  Rne  =  0  for  all  n  and  e,  and  consider  Vte  with  dCle  0 
Tjv  =  0 


0=  E  Rn,e 
n£j\f(e) 

=  E  Gh,n,e 

n£j\f(e) 

+  E  E  /I  &h,f  T  ( 'Unte*(f )  Unte*^f^j  njJ  Wfi^n  •  ne  ds 

n£Af (e)  feJ7i,d(e)yf 


(50) 
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(51) 


Since  the  test  functions  are  a  partition  of  unity,  it  follows  that 


0  =  j  ( rht  —  b )  ckc 


+  X]  /  (Th’f  +  X]  (Un,e*t{f)  ^n,e*(/))  n fWh,n 
f^i,d(ehf  L  n€Af(e) 

=  /  ( mt  -b)  da;  +  ^  /  <7ftj  •  ne  ds 

L  /e^,d(e)7/ 


•  ne  ds 


(52) 


which  gives  the  desired  mass  conservation  statement.  For  elements  on  the 
Neumann  boundary  conservation  follows  likewise  since  by  j  •  ne  =  ab. 

After  obtaining  byj  on  each  yj,  /  =  1  we  obtain  a  representation 

over  all  of  hi  by  projecting  byj  onto  a  local  RTO  velocity  space  [52] 


Vh(Qe)  =  [P°(Qe)]"d  ©  xP°(f2e)  (53) 

Since  crhj  is  piecewise  linear  with  continuous  normal  component,  the  projec¬ 
tion  onto  V /,(h2e)  is  well  defined  and  the  resulting  global  velocity  held  is  in 
P(cliv,  hi).  An  RTO  representation  is  particularly  convenient,  since  the  normal 
flux  through  element  faces 


/  *h,f  ■  n /  ds 

7/ 

can  be  used  as  the  degree  of  freedom  along  with  the  local  basis  [9] 


(54) 


Ne  i .  = _ — 

e’lf  nd\Qe 


(x  -  xn7/)  ,  if  =  l,...,nd+l 


(55) 


where  if  is  a  local  identiher  on  for  yj  and  x„7/  is  the  node  across  from  face 

V- 


In  fact,  one  could  also  project  &h.f  onto  the  linear  Brezzi-Douglas-Duran- 
Fortin  (BDDF1)  space  [62]  (the  Brezzi-Douglas-Marini  space  for  nd  =  2), 
since  cfy/  is  piecewise  linear.  However,  this  requires  more  storage  (nd  +  nd 
versus  nd  +  1  local  degrees  of  freedom),  while  only  first  order  accuracy  can  be 
expected  from  &h  in  general  [17]. 
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4-4-2  Sun-Wheeler  [18]  postprocessing 

Although  its  original  context  is  more  general,  we  restate  the  Gauss-Seidel  algo¬ 
rithm  from  [18]  for  the  applications  of  interest  here,  CG  solutions  for  Richards’ 
equation.  The  notational  conventions  differ  from  the  original  presentation  as 
well  in  order  to  be  more  consistent  with  the  formulations  above. 

The  Sun- Wheeler  Gauss-Seidel  algorithm  assumes  that  an  initial,  well-defined 
velocity  field  &hj  is  available  and  that  &h,f  is  globally  conservative.  It  then 
relies  on  constant  element  boundary  corrections,  {AC//},  /  =  1  For 

given  &hj  and  {AC//},  we  define  the  element  conservation  residual 


Re  /  ]  Gh  n  e  T  /  ]  I  {&h,f  T  AUfiif)  •  ne  ds, 

ne  A/"(e)  7  fGdfle-yj 

e  =  1, . . . ,  Ne  (56) 

and  its  average  Re  =  Re/\Q,e\.  The  assumption  of  global  conservation  implies 
that  no  correction  is  necessary  for  7 /  G  dfl,  so  we  introduce  the  additional 
notation  AC/,  for  the  number  of  interior  element  boundaries  and  a  reordering 
for  convenience  so  that  7 /  G  Tj  for  1  <  /  <  Nft,  7 /  G  dtt  for  AC/.  <  /  <  AC/, 
and  AC//  =  0  for  /  >  AC/.. 


The  algorithm  is  build  around  the  idea  of  minimizing  the  L2  norm  of  R 


/ 


1/2 


R2  (lx 


(57) 


with  respect  to  the  element  boundary  correction  {AC//}.  Here  R  is  the  piece- 
wise  constant  function  with  R\ne  =  Re-  This  is  a  linear  least  squares  problem, 
the  direct  solution  of  which  requires  the  solution  of  a  globally  coupled  linear 
system. 

To  localize  the  computation,  a  series  of  corrections  that  are  nonzero  on  only 
a  single  element  boundary  are  defined 

AUp  =  CfSff,  !</,/'<%  (58) 


where  c/  is  chosen  to  minimize  eqn  (57),  which  leads  to 

_  |G/| Rr  —  \Qr\Ri 
c,[  |7/i(|fiii  +  ia.|) 


(59) 


where  Q;  and  are  the  elements  sharing  face  /,  n/  is  directed  from  to 
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Qr,  and  R  is  the  piecewise  constant  function  with  R\ne  =  Re.  After  a  face  cor¬ 
rection  is  computed,  the  affected  element  conservation  residuals  are  updated 
in  the  manner  of  the  classical  Gauss-Seidel  iteration  for  linear  systems.  The 
algorithm  is  not  however  equivalent  to  Gauss-Seidel  for  the  global  linear  least 
squares  problem. 


To  be  concrete,  the  Sun- Wheeler  iteration  process  is  given  in  Algorithm  1. 
In  Algorithm  1,  km  is  the  maximum  number  of  iterations  allowed,  emc  is  the 
mass  conservation  tolerance,  and  /  G  kFi(e)  corresponds  to  the  set  of  interior 
element  boundaries  in  d kle. 


Algorithm  1  Sun- Wheeler  Gauss-Seidel  algorithm  [18] 


Require:  crh  f  that  is  globally  conservative 

1:  k  —  0.  AUj’Nfi  =  0,  /  =  1  Re’Nfi  defined  from  eqn  (56),  e 

!,•••,  Ne. 


while  k  <  km  and  max,  \R„ 


k,Nf 


<  er 


do 


Rk'°  =  K 


,  e  =  1, . . . ,  Ne. 


AUf’°  =  AUf~1,Nfi,  f  =  1, . . .  ,Nfi. 


for  /  =  1, . . . ,  Nfi  do 

A uy  =  A uy-1  +  C/(RU-')S,r,  /'  =  1 . JVA. 

Ri‘  =  Ri’-1  +  E/'efiW  fv  cAR^Sjrnr  .  n,  d>,  /'  =  l, . . . ,  N,„ 

e  —  1, . . . ,  Ne. 

8:  end  for 

9:  end  while 


Note  that  the  original  presentation  of  the  algorithm  in  [18]  was  written  in 
terms  of  R  and  a  different  sign  convention. 

At  the  algorithm  conclusion,  we  have  a  corrected  velocity  field  defined  on 
element  boundaries 


crhj  =  trh,f  +  A  U fnf  (60) 

that  can  be  extended  to  element  interiors  through  the  projection  process  onto 
local  RTO  spaces  as  in  eqn  (54)  above.  In  general,  higher  order  mixed  spaces 
can  be  used  to  define  local  Neumann  problems  to  meet  higher  order  compati¬ 
bility  conditions  [18,  47]. 

A  precondition  for  the  Sun- Wheeler  algorithm  is  a  globally  conservative  veloc¬ 
ity  held  defined  on  element  boundaries,  &h,f  ■  One  established  strategy  for  ob¬ 
taining  &hj  is  an  initial  postprocessing  that  requires  a  projection  over  G  TD 
[22],  Another  approach  is  to  enforce  Dirichlet  boundary  conditions  weakly  in 
the  original  finite  element  solution  as  in  [25,  63].  We  take  the  latter  approach 
here.  Specifically,  when  the  Sun- Wheeler  Gauss-Seidel  postprocessing  is  used, 
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we  modify  the  weak  formulation  in  eqn  (20)  by  reintroducing  trial  functions 
corresponding  to  T ^  and  adding  the  boundary  integral 


/ 


(f  -  aV"0)  ■  n  +  ap(i>h  - 


r 


ds 


(61) 


where  in  this  work  ap—l/h. 


4-5  Solution  methods 

4-5.1  Nonlinear  and  linear  solvers 

The  finite  element  method  above  leads  to  a  discrete  nonlinear  system  of  equa¬ 
tions  that  must  be  solved  at  each  time  step.  These  nonlinear  problems  are 
usually  large  for  two  and  three-dimensional  domains  and  difficult  to  solve. 
Variants  of  Picard  iteration  along  with  Newton  or  quasi-Newton  methods  have 
all  been  considered  for  Richards’  equation  [4,  5,  64],  However,  no  method  has 
yet  proven  universally  successful  in  our  opinion. 

The  numerical  experiments  below  are  performed  on  a  hierarchy  of  uniformly 
refined  meshes,  Tj£,  i  —  1, . . . ,  Nm.  Here,  we  use  a  standard  Newton  iteration 
with  Armijo  line  search  [65]  on  each  level.  For  the  steady-state  examples, 
nested  iteration  (NLNI)  [66,  67]  is  used  to  generate  the  initial  Newton  iterate 
on  each  level,  to  speed  convergence.  The  resulting  linear  systems  on  each  level 
are  solved  using  the  sparse  direct  SuperLU  solver  [68]  for  simplicity,  since  the 
focus  of  this  work  is  the  finite  element  formulation  and  velocity  postprocessing. 
Other  multilevel  techniques  and  globalization  strategies  are  certainly  possible 
and  can  be  advantageous  in  some  cases  [4,  67]. 


4-5.2  Local  postprocessing  systems 

The  postprocessing  algorithm  presented  in  §4.4.1  requires  solution  of  Nn  in¬ 
dependent  linear  systems  of  size  N*e  x  N*e.  To  solve  eqn  (47),  we  use  the 
LAPACK  LU  routines  dgetrf  and  dgetrs.  The  node-star  systems  are  refac¬ 
tored  only  when  the  mesh  and/or  T jy  changes. 


4-6  Time  integration 


For  simplicity,  we  restrict  the  temporal  approximation  order  to  one  in  the 
transient  test  problem  below,  since  the  focus  of  this  work  is  on  the  spatial 
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discretization  and  velocity  postprocessing  schemes.  To  control  error,  we  require 
at  each  tn+l  that 


\m 


p,n+ 1 
h 


m 


n+ 1 1 


<  er\\m 


n+ 1 1 


+ 


(62) 


where  m^n+1  is  a  hrst  order  predictor  for  the  solution  at  tn+1  computed  by 
extrapolating  from  mvh  and  m^_1.  A  classical  step-size  controller  is  used  for 
time  step  selection  with  maximum  increase  and  decrease  factors  of  two  and 
one-tenth,  respectively  [69,  p.  168].  For  stabilized  CG  approximations,  the 
evaluation  of  r  and  vc  is  lagged  a  time  step  after  an  initial  startup  phase  in 
an  effort  to  simplify  the  nonlinear  solves  [70].  At  is  decreased  by  a  factor  of 
ten  when  a  nonlinear  solver  failure  is  encountered. 


5  Results 


The  finite  element  formulations  given  above  admit  many  variations.  In  the 
following  section,  we  present  a  series  of  numerical  experiments  designed  to 
verify  the  basic  methodology  and  investigate  the  relative  performance  of  some 
of  these  variations.  We  consider  a  standard  CG  approximation  as  well  as  a 
multiscale-stabilized  approach  with  shockcapturing  (CG-S).  Abbreviations  for 
the  methods  used  are  given  in  Table  1. 

For  comparison,  we  also  include  results  obtained  with  the  P 1  nonconforming 
(NC)  approach  from  [53]  as  discussed  in  §2.4.  The  NC  approximation  yields 
velocity  fields  equivalent  to  an  RTO  approximation  for  the  problems  consid¬ 
ered  below  with  the  advantage  of  a  simple  implementation  within  the  same 
computational  framework  as  the  CG  methods  considered.  The  primary  draw¬ 
back  is  that  a  piecewise  constant,  average  approximation  for  source  terms  and 
material  coefficients  is  necessary  for  the  nonconforming  solution  while  this  is 
not  required  for  the  CG  solutions. 

We  combine  the  CG  methods  from  Table  1  with  different  approaches  for  ob¬ 
taining  a  velocity  held,  &h.  We  denote  the  combined  approach  as  (X-*)  where 
X  is  one  of  CG,CG-S,  or  CG-V  and  the  suffix  depends  on  the  evaluation 
of  <t/j.  We  allow  for  pointwise  evaluation  of  by,  via  eqn  (9)  (PE)  as  well  as 
the  Larson-Niklasson  postprocessing  algorithm  from  §4.4.1  (LN)  and  the  Sun- 
Wheeler  algorithm  from  §4.4.2  (SW)  with  local  RTO  representations. 
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Table  1 

Summary  for  FEMs  compared  in  numerical  experiments 
Abbrev.  Definition 

CG  conforming  Galerkin  approximation,  eqn  (41)  with  r  =  0,  vc  =  0 

CG-S  multiscale  stabilized  CG  with  shockcapturing,  eqn  (41),  vc  =  0.1^ 

CG-V  lumped  CG  approximation  with  vertex  quadrature,  t  =  0,  vc  =  0 
NC  P1  nonconforming  approach  [53] 

f  vc  =  0.5  for  Problem  V. 

5.1  Test  problems 

Five  example  problems  were  selected  for  the  numerical  experiments.  The  ma¬ 
jority  are  steady-state,  since  the  emphasis  of  this  work  is  on  spatial  approxi¬ 
mation  techniques.  The  first  two  problems  are  linear  with  a  known  analytical 
solution,  which  facilitates  verification  of  the  numerical  model  and  evaluation 
of  the  methods’  performance  for  cases  where  the  solution  is  smooth.  The  third 
example  is  a  steady-state  recharge  problem  in  a  homogeneous  domain  and 
tests  each  method’s  ability  to  resolve  fronts,  since  the  solution  contains  an 
internal  layer.  The  fourth  problem  consists  of  constant  recharge  in  a  block 
heterogeneous  domain,  while  the  fifth  example  considers  transient  infiltration 
into  the  same  domain. 


5.1.1  Problems  I  and  II 

The  first  two  test  problems  are  for  steady-state,  fully  saturated  flow  in  two 
(Problem  I)  and  three-dimensional  domains  (Problem  II).  Sinusoidal  and  poly¬ 
nomial  heterogeneity  distributions  are  used  in  order  to  allow  calculation  of 
analytical  solutions  in  two  and  three  space  dimensions,  respectively.  To  be 
specific,  we  set  41  =  [0,  T\nd  and 

■0(x)  =  sin2(27nci)  +  cos2(27nr2)  +  sin2(27nr3)  +  X\  +  x2  +  x3  +  5 
iCs^(x)  =  (5  +  x?)8ij  (63) 

for  rid  =  2  and 

V>(x)  =  5>2 

i= 1 

-fCw(x)  =  (5  +  x^Xi+1)5ij  (64) 
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for  rid  —  3.  In  eqn  (64)  Xj+i  is  taken  modulo  3.  For  simplicity,  the  gravity  term 
is  ignored.  The  source  term,  6(x),  and  boundary  conditions  are  calculated  from 
eqn  (1)  assuming  eqns  (63)  and  (64).  For  Problem  II,  Dirichlet  conditions  are 
set  everywhere  except  the  face  =  0.  Dirichlet  conditions  are  set  everywhere 
except  the  segment  X2  =  1,  for  Problem  I. 


5.1.2  Problem  III 

The  third  example  involves  steady-state  variably  saturated  groundwater  flow 
assuming  constant  recharge,  crb  =  —2  x  ICE2  [m/d]  into  a  two-dimensional 
homogeneous  domain,  fl  =  [0,1]  x  [0,5]  [m2].  The  medium  corresponds  to 
the  “Sand”  test  problem  used  in  [71,  see  Table  1]  as  well  as  [3,  7].  To  be 
concrete,  VGM  p-s-k  relations  are  used  with  nvg  =  4.26,  and  avg  =  5.47, 
while  9S  =  0.301,  9r  =  0.093,  and  the  saturated  conductivity  is  isotropic  with 
Ks  =  5.04  [m/d].  We  neglect  compressibility  of  the  aqueous  phase,  (3C  =  0  for 
simplicity,  however. 

A  fixed  pressure  head,  ip  —  1  [m]  is  specified  on  the  domain  bottom  and  no 
flow  along  the  vertical  boundaries.  A  constant  value  of  ip  =  1  [m]  is  also  used 
as  the  initial  guess  for  the  Newton  solve.  The  resulting  solution  contains  a 
steep  transition  zone  and  a  constant  velocity  cr  =  [0 ,crb]T  everywhere.  Even 
though  the  solution  is  essentially  one-dimensional,  it  provides  a  reasonable 
test  for  the  ability  of  the  finite  element  methods  to  resolve  the  boundary  layer 
and  maintain  symmetry  in  the  velocity  held. 


5.1.3  Problems  IV  and  V 

The  fourth  and  fifth  examples  are  based  on  the  second  test  problem  from  [10]. 
VGM  p-s-k  relations  are  again  used  with  four  separate  media  types  configured 
in  a  simple  block  pattern.  The  domain  consists  of  two  shallow  layers  near  the 
surface  and  a  third,  larger  region  with  a  small  high  conductivity  sub-block.  The 
media  properties  and  domains  are  summarized  in  Table  2.  To  be  consistent 

with  [10],  we  set  f3c  =  0  as  well.  The  physical  domain  is  O  =  [0,8]  x  [0,6.5] 
2 

Problem  IV  is  a  steady-state  example.  A  constant  recharge  of  ab  =  -2xl0”2 
[m/d]  is  set  on  the  top  left  2.25  meters  of  the  domain.  No  how  boundaries  are 
set  along  the  remainder  of  the  top  boundary,  as  well  as  the  left  boundary.  Hy¬ 
drostatic  conditions  are  set  on  the  right  boundary  with  a  water  table  elevation 
of  2.17  [m]. 

Problem  V  considers  transient  infiltration  and  has  no  how  boundary  conditions 
along  the  right  boundary  rather  than  hxed  ip.  We  set  the  initial  condition  to 
ip°  =  -89.96  [m]. 
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Table  2 

Media  properties  for  Problems  IV  and  V 


Type 

Ks  [m/d] 

M-] 

0r  [-] 

av  [1/m] 

nv  [-] 

Location 

1 

7.89 

0.3680 

0.1020 

3.334 

1.982 

[0,8]  x  [6.0,  6.5] 

2 

4.69 

0.3510 

0.09849 

3.63 

1.632 

[0,8]  x  [5.5,  6.0] 

3 

4.143 

0.3250 

0.08590 

3.455 

5 

fit 

4 

41.143 

0.3250 

0.08590 

3.455 

5 

[1,3]  x  [4,5] 

f  medium  type  is  3  by  default 
5.2  Error  and  work  measures 


To  measure  accuracy  of  numerical  approximations  for  if;  and  <r,  we  use  relative 
error  in  discrete  Lp  norms 


£-U,p 


IK  -  u\ 


U 


(65) 


where  u  is  generic  variable  placeholder  and  p  =  1,2,  or  oo.  u  is  the  analytical 
solution  when  available  or  a  reference  numerical  solution  otherwise.  When 
analytical  solutions  are  not  available,  the  discrete  solution  on  the  finest  mesh  in 
the  multilevel  hierarchy  is  used  as  the  reference  solution  and  coarser  solutions 
are  projected  onto  the  finest  mesh  in  order  to  approximate  error.  This  was 
judged  to  give  an  acceptable  measure  of  solution  quality  even  though  error 
in  the  reference  solution  is  nonnegligible  for  Problems  IV  and  V  given  their 
heterogeneous,  nonlinear  character  [11], 

Mass  conservation  error  is  recorded  as 


£mc  =  max 

'  le 


J  rht  dx  +  J 

f2  e  9Qe 


&h  ■  neds  —  /  bdx 


ne 


(66) 


where  <Jyl  is  a  velocity  held  obtained  from  one  of  the  postprocessing  methods 
described  above:  PE,  LN,  SW,  or  the  NC  approximation. 

There  are  several  ways  to  measure  computational  effort,  including  number  of 
degrees  freedom,  Nd0f  as  well  as  total  CPU  time  [72],  We  rely  primarily  on 
total  degrees  of  freedom  here,  since  the  majority  of  the  computations  for  the 
methods  compared  are  implemented  within  the  same  general-purpose  finite 
element  library.  The  sparsity  pattern  of  the  NC  approximation  can  affect  per¬ 
formance  [73]  in  general,  but  the  impact  on  SuperLU  is  minimal  beyond  the 
difference  in  total  degrees  of  freedom.  Since  the  LN  postprocessing  algorithm 
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requires  building  and  factoring  local  systems  for  each  £(n),  we  report  the  ap¬ 
proximate  CPU  overhead  for  each  mesh  as  well,  where  the  CPU  times  are 
averaged  over  at  least  1000  repetitions.  For  the  SW  algorithm,  we  report  the 
CPU  time  required  to  reach  convergence  or  exhaust  the  allowed  number  of 
iterations  for  each  simulation. 


5.3  Implementation  details 


Element  integrals  in  Problem  I  and  II  were  approximated  using  Gaussian 
quadrature  that  is  exact  for  quartics  and  cubics,  respectively.  Fourth-order 
Gaussian  quadrature  was  again  used  for  the  CG  approximations  in  Prob¬ 
lem  III,  while  only  third-order  quadrature  was  used  for  the  nonlinear,  hetero¬ 
geneous  examples.  Second-order  quadrature  with  points  taken  from  element 
boundary  barycenters  was  used  for  the  NC  approximations  in  the  nonlinear 
examples.  The  nonlinear  solver  employed  a  relative  residual  convergence  test. 
For  Problem  III,  absolute  and  relative  tolerances  of  eni>a  =  1  x  10-8  and 
£ni,r  —  h  x  1CU3,  where  h  is  the  mesh  diameter  were  used.  For  Problems  IV 
and  V,  enita  =  1  x  10~6,  while  eni:T  —  h  x  1CT6.  The  absolute  and  relative 
temporal  integration  tolerances  were  set  to  ea  =  er  =  1CU3  for  Problem  V.  A 
convergence  tolerance  of  emc  =  10~6  and  was  used  for  the  SW  postprocessing 
in  all  cases  and  the  maximum  number  of  iterations  was  set  kmax  =  10000. 

The  computations  were  performed  using  a  finite  element  library  under  devel¬ 
opment  by  researchers  at  the  US  Army  Corps  of  Engineers,  Clemson  Univer¬ 
sity,  North  Carolina  State  University,  the  University  of  Texas  at  Austin,  and 
Applied  Research  Associates,  Inc.  High  level  routines  and  methods  were  im¬ 
plemented  in  Python,  while  low-level  computationally  intensive  portions  were 
in  c  and  F77.  Simulations  were  run  on  a  3  GHz  duo  core  Intel  Xeon  Mac  Pro 
desktop  with  8  GB  memory.  Python  2.5,  gcc  4.0.1,  and  g77  3.4.0  were  used 
to  interpret  and  compile  the  code. 


5-4  Simulation  Results 
5. 4-1  Problems  I  and  II 

Tables  3  and  5  record  the  accuracy  of  the  CG  and  NC  approximations  for 
^  on  uniformly  refined  meshes  for  Problem  I  and  Problem  II,  respectively, 
while  Tables  4  and  6  compare  accuracy  and  local  mass  conservation  for  the 
different  velocity  postprocessing  techniques.  The  CPU  overhead  for  building 
and  factoring  the  local  node-star  systems,  eqn  (46),  on  each  mesh  are  reported 
in  Table  7. 
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Stabilization  and  vertex  quadrature  were  not  considered  for  Problems  I  and 
II,  since  the  solutions  were  smooth.  Looking  at  Tables  3  and  5,  both  CG 
and  NC  methods  obtained  second-order  convergence  for  ip,  as  expected.  For 
Problem  I,  the  accuracy  of  the  NC  solution  was  clearly  limited  by  the  use  of 
piecewise  constant  average  values  for  b,  since  the  CG  and  NC  error  values  were 
nearly  identical,  even  though  Nrj0f  for  the  NC  approximation  was  significantly 
higher.  For  Problem  II,  the  NC  errors  were  significantly  lower  for  a  given  level 
of  refinement,  but  the  CG  accuracy  was  slightly  better  per  degree  of  freedom. 

Global  accuracy  of  the  velocity  postprocessing  methods  was  similar  for  Prob¬ 
lem  I.  In  particular,  point-wise  evaluation  was  first-order  accurate  even  though 
local  mass  conservation  failed.  For  Problem  II  the  CG-PE  approximation  was 
again  Erst  order  accurate,  but  the  actual  error  values  were  two  to  three  times 
higher.  The  accuracy  of  the  LN  and  SW  velocity  fields  was  essentially  the 
same.  On  each  level,  the  NC  approximation  error  for  &h  was  significantly 
lower  than  that  of  the  LN  and  SW  approximations,  but  the  postprocessing 
algorithms  accuracy  was  competitive  if  not  better  in  terms  of  accuracy  per 
degree  of  freedom.  The  costs  in  CPLI  time  clearly  increased  for  n(j  =  3  due  to 
the  increased  nodal  and  element  boundary  connectivity. 


Table  3 

£,^2,  Problem  I 


Method 

Level 

h 

Ndof 

eb,2 

Rate 

CG 

1 

0.354 

25 

1.23x  10-2 

- 

NC 

1 

0.354 

56 

5.76x  10-2 

- 

CG 

2 

0.177 

81 

1.520x  10-2 

-0.225 

NC 

2 

0.177 

208 

1.51xl0-2 

1.85 

CG 

3 

0.0884 

289 

3.97xl0-3 

1.93 

NC 

3 

0.0884 

800 

4.14x  10-3 

1.94 

CG 

4 

0.0442 

1089 

l.OlxlO-3 

1.98 

NC 

4 

0.0442 

3136 

1.05x  10-3 

1.98 

CG 

5 

0.0221 

4225 

2.52x  10-4 

2.00 

NC 

5 

0.0221 

12416 

2.64x  10-4 

1.99 

5-4-2  Problem  III 

Figure  3  compares  approximate  solutions  for  m  from  Problem  III  after  uni¬ 
formly  refining  an  initial  eight  element  mesh  four  times.  The  resulting  mesh 
was  still  fairly  coarse,  with  512  elements,  800  edges,  and  289  nodes.  For  simplic¬ 
ity,  the  NC  approximation  is  shown  using  nodal  values  obtained  by  averaging 
the  corresponding  vertex  values  from  neighboring  elements.  Accuracy  of  the 
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Table  4 


ea‘2,  and  emc,  Problem  I 


Method 

Level 

£<r,  2 

rate 

J 

&mc 

CG-PE 

1 

4.44X10”1 

- 

1.30X101 

CG-LN 

1 

7.60X10”1 

- 

0 

CG-SW 

1 

8.30X10”1 

- 

9.94x  10-7 

NC 

1 

6.98X10”1 

- 

0 

CG-PE 

2 

4.25x  10”1 

-8.84x  10_1 

6.54x10° 

CG-LN 

2 

4.21X10”1 

6.32X10”1 

0 

CG-SW 

2 

4.56X10-1 

8.63xl0_1 

9.94x  10-7 

NC 

2 

4.16X10"1 

7.45x  10_1 

0 

CG-PE 

3 

2.19X10”1 

9.58X10”1 

2.07x10° 

CG-LN 

3 

2.19X10”1 

9.39X10"1 

0 

CG-SW 

3 

2.22X10”1 

1.04x10° 

9.94x  10-7 

NC 

3 

2.17X10”1 

9.37xl0_1 

0 

CG-PE 

4 

l.llxlO-1 

9.89X10"1 

6.28x  10_1 

CG-LN 

4 

l.lOxlO”1 

9.84X10'1 

0 

CG-SW 

4 

l.llxlO-1 

1.00x10° 

9.99x  10_1 

NC 

4 

l.lOxlO”1 

9.81xl0_1 

0 

CG-PE 

5 

5.54xl0"2 

9.97X10”1 

2.05x  10_1 

CG-LN 

5 

5.53x  10~2 

9.96xl0_1 

0 

CG-SW 

5 

5.54x  10~2 

1.00x10° 

l.OxlO-6 

NC 

5 

5.53x  10-2 

9.95X10”1 

0 

f  0  means 

<1x10 

,-12 

CG  and  NC  approximations  is  summarized  in  Table  8,  where  £ai;OC  and  £ait00 
are  discrete  L ^  errors  for  the  transverse  and  longitudinal  velocity  coordinates, 
respectively.  £^;00  was  estimated  using  the  a  reference  solution  obtained  from 
seven  levels  of  refinement  (Ndof  =  16641  )  in  the  case  of  the  CG  methods,  and 
six  levels  of  refinement  for  the  NC  solution  (Ndof  =  12416).  Table  9  gives  the 
CPU  effort  for  the  velocity  postprocessing  schemes. 

From  Table  8,  it  is  clear  that  the  LN  technique  conserved  mass  up  to  the  non¬ 
linear  solver  error,  regardless  of  the  CG  variation  with  which  it  was  combined. 
Similarly,  the  SW  algorithm  conserved  mass  up  to  the  imposed  conservation 
tolerance,  emc  =  10-6.  The  NC  velocity  approximation  was  essentially  exact 
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Table  5 


2  j  Problem  II 


Method 

Level 

h  Ndof 

£-0,2 

Rate 

CG 

1 

0.433  125 

2.75x10' 

-2 

NC 

1 

0.433  864 

3.63x10' 

-3 

CG 

2 

0.217  729 

6.87x10' 

'3  2.00 

NC 

2 

0.217  6528 

8.85x10' 

'4  2.05 

CG 

3 

0.108  4913 

1.72x10' 

'3  2.00 

NC 

3 

0.108  50688  2.17x10' 

'4  2.03 

Table  6 

£<j;2  and  £mc,  Problem  II 

Method 

Level 

£<r,  2 

rate 

et 

&mc 

CG-PE 

1 

1.22X10"1 

- 

2.87xl0_1 

CG-LN 

1 

6.38X10"2 

- 

0 

CG-SW 

1 

5.80X10"2 

- 

9.90x  10-7 

NC 

1 

1.29x  10~2 

- 

0 

CG-PE 

2 

6.09X10"2 

1.01x10° 

8.27xl0“2 

CG-LN 

2 

2.74xl0“2 

1.22x10° 

0 

CG-SW 

2 

2.95x  10”2 

9.75x  10_1 

9.97xl0-7 

NC 

2 

6.69x  10~3 

9.50x  10_1 

0 

CG-PE 

3 

3.04xl0"2 

1.00x10° 

2.21xl0-2 

CG-LN 

3 

1.19xl0~~2 

1.20x10° 

0 

CG-SW 

3 

1.39x  10~~2 

1.09x10° 

l.OOx  10-6 

NC 

3 

3.39x  10~~3 

9.82x  10_1 

0 

f  0  means 

<1x10 

r12 

up  to  the  nonlinear  solver  tolerance,  while  the  error  in  both  postprocessing 
schemes  was  significantly  lower  than  the  error  in  the  directly  evaluated  ve¬ 
locity  held  (CG-PE).  The  LN  approximations  were  more  accurate  than  SW 
solutions  on  the  level  four  mesh,  and  the  CG-V  velocities  were  more  accu¬ 
rate  than  those  obtained  from  the  CG-S  approximation,  apparently  dne  to 
excessive  numerical  diffusion  from  the  shock  capturing  term. 

Comparing  error  values  for  ip  and  the  solution  profiles  from  Figure  3,  £^,i00 
corresponds  to  undershoot  at  the  front.  The  lumped  (CG-V)  and  consistent 
(CG)  approximations  incur  the  most  undershoot.  The  multiscale  stabilized 
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Table  7 

CPU  overhead  for  velocity  postprocessing  algorithms,  Problems  I  and  II 


Problem 

Level 

Nn 

Nf 

LN  CPUf  [s] 

SW  CPU*  [s] 

I 

1 

25 

56 

0 

0  (83) 

I 

2 

81 

208 

0 

1.67xl0-2  (342) 

I 

3 

289 

800 

2.15X10"3 

1.17X10"1  (907) 

I 

4 

1089 

3136 

6.47xl0-3 

9.67xl0_1  (1687) 

I 

5 

4225 

12416 

3.22x  10-2 

1.78x10°  (831) 

II 

1 

125 

864 

2.41x  10-3 

3.33xl0-2  (243) 

II 

2 

729 

6528 

2.45x  10-2 

3.83X10”1  (466) 

II 

3 

4913 

50688 

2.65x  10_1 

4.23x10°  (484) 

f  0  means  <1x10  3  [s] 

*  number  of  iterations  in  parentheses 


approach  with  shock-capturing  was  the  only  method  able  to  resolve  the  solu¬ 
tion  monotonically  on  the  coarser  meshes.  As  one  would  expect,  the  relative 
advantage  of  the  CG-S  approach  decreased  as  the  mesh  was  refined  and  the 
other  methods  were  then  able  to  resolve  the  internal  layer.  For  instance,  the 
NC  solution  did  not  fully  avoid  undershoot  until  the  fifth  level  of  refinement 
although  the  nodal  averages  shown  in  3  are  clearly  more  accurate  than  the 
CG-S  solution. 


Table  8 

Problem  III  solution  error 


Method 

Level' 

Ndof 

£i/>,oo 

Smc 

S<7\  , OO 

£<j  2,00 

CG-PE 

4 

289 

8.72xl0-2 

2.26xl0-3 

4.40X10”3 

2.19x10° 

CG-LN 

4 

289 

8.72xl0~2 

1.35xl0-8 

4.37xl0-4 

1.89X10'1 

CG-S-LN 

4 

289 

4.36xl0-2 

3.00xl0_7 

3.90X10-4 

3.65X10'1 

CG-S-SW 

4 

289 

4.36xl0-2 

9.97xl0-7 

2.10xl0~3 

6.91xl0_1 

CG-V-LN 

4 

289 

1.75xl0_1 

9.33xl0-9 

2.29xl0-4 

9.84xl0~2 

CG-V-SW 

4 

289 

1.75xl0_1 

9.93xl0-7 

6.99xl0~4 

2.49X10”1 

NC 

4 

800 

9.49xl0~2 

0 

5.21xl0-8 

1.66xl0-4 

CG-PE 

5 

1089 

3.28xl0-2 

7.17xl0-4 

2.22X10”3 

1.61x10° 

CG-LN 

5 

1089 

3.28xl0~2 

6.10xl0_8 

3.29xl0-4 

1.71X10"1 

CG-S-LN 

5 

1089 

2.30xl0-2 

8.88xl0-8 

3.39xl0~4 

3.58X10-1 

CG-S-SW 

5 

1089 

2.30xl0-2 

9.97xl0-7 

l.llxlO-3 

3.98X10”1 

CG-V-LN 

5 

1089 

5.58xl0-2 

4.92xl0-9 

2.54xl0~4 

1.12x10-! 

CG-V-SW 

5 

1089 

5.58X10"2 

9.98xl0-7 

4.99xl0~4 

1.45x10-! 

NC 

5 

3136 

2.32xl0-2 

0 

2.71xl0-7 

1.55x10-® 

f  h  =  0.319  on  level  4,  and  h  =  0.160  on  level  5 
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Table  9 

CPU  overhead  for  velocity  postprocessing  algorithms,  Problem  III 


Method 

Level 

Its. 

CPU  [s] 

LN 

4 

- 

1.35xl0-3 

CG-V-SW 

4 

250 

3.33x  10-2 

CG-S-SW 

4 

595 

6.67xl0-2 

LN 

5 

- 

5.52xl0-3 

CG-V-SW 

5 

83 

3.33xl0-2 

CG-S-SW 

5 

207 

1.17xl0_1 

Fig.  3.  Profile  of  m  along  x  =  0.5 [m]  on  mesh  level  4 


5-4-3  Problem  IV 

Figure  4  illustrates  the  problem  domain  and  initial  mesh  generated  by  Triangle 
[74]  for  Problem  IV  and  V.  Performance  of  the  methods  for  Problem  IV  is 
summarized  in  Table  10.  CPU  overhead  for  the  LN  and  SW  algorithms  is 
shown  in  Table  11.  NLNI  failed  with  the  CG  approximation  for  Problem  IV, 
due  to  failure  in  the  Newton  solve  on  coarser  levels  in  the  mesh  hierarchy. 
Instead,  the  CG  results  in  Table  10  were  obtained  by  solving  Newton’s  method 
independently  on  levels  3,4,  and  5,  and  we  also  provide  the  ratio  of  the  CPU 
time  for  the  LN  and  SW  algorithms  to  the  CPU  time  for  Newton’s  method 
as  a  percentage,  which  demonstrates  the  minor  cost  of  the  post-processing 
approaches  relative  to  the  cost  of  calculating  the  finite  element  solution. 
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Figure  5  compares  the  LN  &h  obtained  from  the  CG-S  approximation  and  the 
NC  solution  for  by,  after  three  levels  of  refinement.  The  CG-V-LN  solutions 
for  -0  and  &h  on  level  five  are  shown  in  Figure  6.  The  velocity  fields  in  Figure 
5  are  similar  and  close  to  the  CG-V-LN  solution  on  five  levels  of  refinement 
except  for  a  deviation  in  the  CG-S-LN  solution  around  (4. 5, 2. 5).  This  was 
generally  true  of  the  various  method  combinations  listed  in  Table  10  with  the 
exception  of  the  point-wise  evaluated  by,,  and  the  CG-S-SW  by,  on  level  three 
which  had  nearly  an  order  of  magnitude  greater  error  than  the  remaining 
approximations. 

Both  postprocessing  schemes  maintained  local  mass  conservation,  but  the  rel¬ 
ative  expense  of  the  SW  approach  increased  for  Problem  IV.  This  was  par¬ 
ticularly  evident  with  the  stabilized  approximations  (CG-S),  where  between 
two  and  four  times  as  many  iterations  were  required  than  with  the  lumped 
approximation.  Accuracy  in  terms  of  c,^2  was  comparable  among  the  methods, 
given  the  level  of  error  likely  in  the  finest  level  approximation. 

Fig.  4.  Domain  and  initial  mesh  for  Problems  IV  and  V 


6.0)  [m] 
5.5)  [m] 


b  =  0 


0)[m 


I  J 
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Table  10 

Solution  errors,  Problem  IV 


Method 

Level^ 

Ndof 

^mc 

£(7,2 

CG-LN 

3 

1351 

3.55x  10-n 

2.37xl0"2 

1.20x  10_1 

CG-V-LN 

3 

1351 

1.20x  10-8 

3.64x  10”2 

1.22x  10_1 

CG-V-SW 

3 

1351 

l.OOxlO"6 

3.64x  10-2 

1.83x  10_1 

CG-S-PE 

3 

1351 

1.86x  10_1 

6.35x  10-2 

1.94x10° 

CG-S-LN 

3 

1351 

1.08x  10-6 

6.35xl0”2 

1.83x  10_1 

CG-S-SW 

3 

1351 

9.98x  10-7 

6.35xl0”2 

1.18x10° 

NC 

3 

3926 

0 

1.44x  10”2 

1.18x  10_1 

CG-LN 

4 

5277 

2.71xl0~n 

9.45x  10-3 

5.35x  10-2 

CG-V-LN 

4 

5277 

8.00xl0"9 

2.08xl0”2 

5.37xl0-2 

CG-V-SW 

4 

5277 

l.OOxlO"7 

2.08x  10-2 

8.56x  10-2 

CG-S-PE 

4 

5277 

2.45x  10-2 

8.59xl0"3 

2.12x  10_1 

CG-S-LN 

4 

5277 

6.67xl0-7 

8.59xl0"3 

5.72x  10-2 

CG-S-SW 

4 

5277 

l.OOxlO"6 

8.59xl0"3 

l.OlxlO-1 

NC 

4 

15580 

0 

6.27x  10-3 

5.24x  10-2 

f  h  =  0.707 

on  level  3,  h  = 

0.354  on  level  4,  and  h  =  0.177  on  level  5 

Table  11 

CPU  overhead  for  LN  and  SW  algorithms, 

Problem  IV 

Method 

Level 

Its. 

CPU  [s]  %  of  Newton  CPU 

LN 

3 

- 

7.14xl0”3 

0.0734 

CG-V-SW 

3 

651 

4.67xl0_1 

4.80 

CG-S-SW 

3 

1714 

1.20x10° 

12.3 

LN 

4 

- 

3.45xl0-2 

0.106 

CG-V-SW 

4 

586 

1.52x10° 

4.69 

CG-S-SW 

4 

1195 

3.12x10° 

9.63 

LN 

5 

- 

1.56X10”1 

0.274 

CG-V-SW 

5 

140 

1.60x10° 

2.81 

CG-S-SW 

5 

575 

8.15x10° 

14.3 
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Fig.  5.  &h  CG-S-LN  level  3  (left),  NC  (right) 


Fig.  6.  CG-V  level  5  ip  (left),  &h  (right) 


5-4-4  Problem  V 

Problems  IV  and  V  differed  only  in  the  boundary  condition  on  the  right  face  of 
the  domain  and  the  transient  nature  of  Problem  V.  Since  Problem  IV  demon¬ 
strated  the  accuracy  of  the  postprocessing  schemes  for  heterogeneous  prob¬ 
lems,  the  simulations  for  Problem  V  focused  on  the  ability  of  the  CG-S  and 
CG-V  methods  to  resolve  the  infiltration  front  monotonically  and  the  overall 
expense  of  the  LN  and  SW  schemes  when  used  at  each  time  step  of  a  tran¬ 
sient  simulation.  Table  12  shows  mass  conservation  error  and  computational 
effort  for  the  LN  and  SW  algorithms  when  combined  with  the  CG-S  and  CG- 
V  discretizations.  Figures  8  and  9  compare  the  solution  fronts  at  T  =  30  [d] 
for  CG-S  and  CG-V  on  three  levels  of  refinement.  Figure  9  shows  the  CG-V 
solution  for  ip  on  five  levels  of  refinement.  Upwinding  was  not  necessary  for 
the  CG-V  approximation  for  these  refinement  levels,  and  both  solutions  had 
little  or  no  overshoot.  Although  we  did  not  do  a  detailed  comparison  of  the 
NC  solution  to  the  CG-S  and  CG-V  solutions  for  this  problem,  we  did  find 
that  the  NC  solution  had  significant  undershoot. 

The  three-level  CG-V  solution  smeared  the  infiltration  front  enough  to  almost 
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reach  the  lower  right  corner  of  the  physical  domain  at  T  =  30  [d].  On  the 
other  hand,  the  CG-S  solution  smeared  the  infiltration  front  out  significantly 
in  the  horizontal  direction,  so  that  the  leading  edge  of  the  front  lagged  the 
CG-V  solution  along  the  bottom  no-flow  boundary.  The  lateral  smearing  of 
the  CG-S  solution  was  caused  primarily  by  the  isotropic  nature  of  the  shock- 
capturing  diffusion  and  the  relatively  large  value  vc  =  0.5  necessary  to  obtain 
an  oscillation-free  solution. 

The  CPU  times  reported  in  Table  12  include  the  initial  expense  required  for 
building  and  factoring  the  local  node-star  systems  as  well  as  the  per  iteration 
backsubstitutions.  Since  the  mesh  and  boundary  conditions  did  not  change  for 
Problem  V,  the  overall  computational  advantage  for  the  LN  postprocessing 
increased  significantly.  In  addition,  we  note  that  the  SW  algorithm  failed  to 
converge  within  the  maximum  allowed  iterations  for  two  time  steps  on  level 
four  and  failed  for  four  time  steps  on  level  five.  However,  the  maximum  mass 
conservation  error  in  these  cases  was  3.19xlCU5. 


Table  12 

LN  and  SW  performance,  Problem  V 


Method 

Level 

emc  at  T  =  30  [d] 

Avg.  its. 

Avg.  CPUf  [s] 

CG-V-LN 

3 

1.18x  10~7 

- 

1.52x  10-3  (5.62x  10"3) 

CG-V-SW 

3 

9.97xl0-7 

451 

3.13x  10_1 

CG-S-LN 

3 

3.61xl0-7 

- 

1.52x  10-3  (5.62x  10“3) 

CG-S-SW 

3 

l.OOx  10~6 

2289 

1.58x10° 

CG-V-LN 

4 

2.09x  10-7 

- 

9.70x  10-3  (2.49x  10“2) 

CG-V-SW 

4 

9.99x  10-7 

377 

9.74x  10_1 

CG-S-LN 

4 

4.73x  10~8 

- 

9.70x  10-3  (2.49x  10“2) 

CG-S-SW 

4 

l.OOx  10~6 

2072 

5.39x10° 

CG-V-LN 

5 

5.37xl0~7 

- 

4.17xl0-2  (1.14X10”1) 

CG-V-SW 

5 

9.97  x  10~7 

185 

2.13x10° 

CG-S-LN 

5 

1.52x  10~7 

- 

4.17xl0-2  (1.14X10”1) 

CG-S-SW 

5 

l.OOx  10~6 

584 

8.20x10° 

f  CPU  required  to  build  and  factor  LN  node-star  systems  in  parentheses 


5.5  Discussion 

There  were  two  basic  aims  of  this  work.  The  first  was  to  formulate  a  mul¬ 
tiscale,  stabilized  finite  element  approximation  for  Richards’  equation.  The 
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Fig.  7.  0  CG-V  level  3  solution 


x[m] 


second  was  to  detail  two  postprocessing  techniques  that  are  capable  of  pro¬ 
ducing  locally  conservative  velocity  fields  from  traditional  and  stabilized  finite 
element  solutions  for  pressure-head.  The  goal  of  the  numerical  experiments  was 
to  determine  if  the  multiscale-stabilized  strategy  could  provide  improved  ap¬ 
proximations  for  0  and  to  evaluate  the  accuracy  and  computational  efficiency 
of  the  velocity  postprocessing  algorithms. 

The  numerical  experiments  covered  a  range  of  conditions  from  steady-state 
single-phase  flow  to  variably  saturated,  block  heterogeneous  infiltration  prob¬ 
lems.  The  multiscale  stabilization  was  compared  to  a  traditional  conforming 
finite  element  approximation  with  and  without  mass-lumping,  while  the  im¬ 
pact  of  velocity  postprocessing  was  measured  by  comparison  to  direct  evalua¬ 
tion  of  er  via  Darcy’s  law.  As  a  point  of  reference,  we  also  considered  a  locally 
conservative  nonconforming  finite  element  approximation  that  coincides  with 
a  mixed  hybrid  finite  element  approximations  in  many  cases. 

The  velocity  postprocessing  algorithm  from  Larson  and  Niklasson  [17]  com¬ 
bined  with  a  local  RTO  representation  worked  well  with  each  of  the  con¬ 
forming  Galerkin  methods  considered.  Local  mass  conservation  was  always 
obtained,  and  global  accuracy  was  improved.  The  postprocessing  algorithm 
applied  to  piecewise  linear  discretizations  produces  linear  normal  fluxes  on 
element  boundaries,  which  naturally  matches  a  linear  BDM1  or  BDDF1  rep- 
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Fig.  8.  U  CG-S  level  3  solution 
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resentation  on  element  interiors.  While  not  shown,  we  found  that  use  of  a 
local  BDM1  or  BDDF1  representation  improved  accuracy  for  some  cases  such 
as  Problem  I.  In  general,  the  improvements  were  not  sufficient  to  justify  the 
added  storage  (nd  +  rid  versus  rid  +  1  local  degrees  of  freedom),  since  the  post- 
processed  boundary  fluxes  were  only  first  order  accurate  [17].  On  the  other 
hand,  local  projections  onto  higher-order  mixed  spaces  following  the  approach 
laid  out  in  [18]  could  still  be  used  to  obtain  velocity  fields  satisfying  higher 
order  compatibility  conditions  [47].  The  algorithm  is  theoretically  scalable  for 
parallel  implementations. 

The  Sun- Wheeler  postprocessing  scheme  produced  mass-conservative  velocity 
fields  with  accuracy  similar  to  that  of  the  Larson  and  Niklasson  algorithm, 
with  the  exception  of  coarse  grid,  multiscale-stabilized  solutions  to  Problem 
IV.  The  Sun- Wheeler  algorithm  usually  required  more  CPU  time  than  the 
Larson-Niklasson  approach,  and  it  failed  to  converge  in  the  maximum  number 
of  iterations  allowed  for  a  few  (four)  time  steps  in  Problem  V.  The  CPU  time 
required  for  both  methods  was  a  small  fraction  of  the  total  solver  time  in  most 
cases,  however,  and  the  mass  conservation  residual  was  still  small  (at  least 
3.18xl0-5)  when  the  Sun- Wheeler  iterations  failed  to  converge.  The  Sun- 
Wheeler  algorithm  as  presented  is  inherently  sequential  with  a  convergence 
rate  depending  on  the  mesh,  but  implementation  is  quite  simple. 
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Fig.  9.  ip  CG-V  level  5  solution 
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We  have  tried  to  be  comprehensive  in  the  numerical  experiments,  yet  there 
are  several  factors  that  were  not  accounted  for  directly.  For  instance,  the 
Sun- Wheeler  algorithm  requires  minimal  storage,  while  the  Larson- Niklasson 
approach  required  storage  of  the  local,  factored  node-star  systems  in  our 
implementation.  This  memory  overhead  can  be  non-trivial  for  large,  three- 
dimensional  problems.  Similarly,  we  did  not  consider  local  mesh  refinement, 
which  would  have  required  rebuilding  some  of  the  Larson-Niklasson  node-star 
systems  and  likely  increased  its  overhead.  We  also  did  not  include  the  global 
algorithms  from  [17,  18]  which  might  be  competitive  with  a  sufficiently  fast 
linear  solver  [18]. 

The  RTO  velocity  fields  from  the  velocity  postprocessing  schemes  were  typi¬ 
cally  comparable  in  accuracy  to  RTO  velocity  fields  obtained  from  the  non- 
conforming  discretization,  and  were  often  cheaper  to  obtain  in  terms  of  total 
degrees  of  freedom.  While  these  results  do  not  attempt  to  provide  a  compre¬ 
hensive  evaluation  of  the  relative  merits  of  primal  conforming  and  mixed  finite 
element  approximations,  they  do  indicate  that  the  proposed  approach  should 
be  at  least  competitive  with  mixed  methods  for  Richards’  equation. 

The  effectiveness  of  the  multiscale  stabilization  strategy  varied  somewhat. 
For  a  steady-state,  variably  saturated  example  where  the  solution  contained  a 
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sharp  internal  layer  (Problem  III),  it  improved  resolution  over  the  other  meth¬ 
ods  considered  on  coarse  grids.  On  the  other  hand,  its  advantages  over  tradi¬ 
tional  mass-lumping  were  more  limited  for  Problem  V.  Although  the  frame¬ 
work  used  here  is  quite  general,  the  stabilization  and  shock  capturing  parame¬ 
ters  were  evaluated  using  direct  extensions  of  standard  formulas  for  nonlinear 
advection-diffusion-reaction  problems,  and  the  discrete  strong  residual  was  ap¬ 
proximated  using  the  chain-rule  for  simplicity.  These  approaches  clearly  need 
refinement  to  obtain  more  robust  and  accurate  approximations  for  Richards’ 
equation. 
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